Handle invalid numeric entity references individually#181

Draft
CodingFeng101 wants to merge 1 commit into un33k:master from CodingFeng101:codex/preserve-valid-numeric-entities

Summary

  • convert decimal and hexadecimal numeric character references independently
  • keep invalid numeric references unchanged so later slug cleanup can handle them
  • add regression coverage for mixed valid and invalid numeric references

Why

The previous implementation wrapped the whole regex substitution in one try block. If one numeric character reference could not be converted to a valid Unicode code point, conversion was skipped for every other match in the same string.

For example, &#381; &#9999999999; produced 381-9999999999, even though the first reference is valid and should become z.
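
The per-match handling described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the pattern names, the `_convert` helper, and `replace_numeric_refs` are hypothetical, and the project's actual regexes may differ.

```python
import re

# Hypothetical patterns for illustration; not taken from the project source.
DECIMAL_REF = re.compile(r'&#(\d+);')
HEX_REF = re.compile(r'&#x([\da-fA-F]+);')


def _convert(match, base):
    """Convert one numeric reference, or return it unchanged if it is invalid."""
    try:
        return chr(int(match.group(1), base))
    except (ValueError, OverflowError):
        # Not a valid Unicode code point: keep this reference as-is so that
        # later cleanup can deal with it, without affecting other matches.
        return match.group(0)


def replace_numeric_refs(text):
    """Replace decimal and hexadecimal references one match at a time."""
    text = DECIMAL_REF.sub(lambda m: _convert(m, 10), text)
    return HEX_REF.sub(lambda m: _convert(m, 16), text)
```

Because the try block sits inside the per-match callback, a failure on one reference only leaves that reference unchanged. With the input from the example, the first reference is converted to Ž and the second is left in place.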

Checks

  • python -m pytest -q -> 84 passed
  • python -m mypy -> Success: no issues found in 5 source files
  • python -m pycodestyle --ignore=E128,E261,E225,E501,W605 slugify test.py setup.py
  • git diff --check
